Skip to content

[metrics] Add client metrics and prometheus example#617

Open
charlesdong1991 wants to merge 2 commits into
apache:mainfrom
charlesdong1991:metrics-doc-update-and-prometheus-example
Open

[metrics] Add client metrics and prometheus example#617
charlesdong1991 wants to merge 2 commits into
apache:mainfrom
charlesdong1991:metrics-doc-update-and-prometheus-example

Conversation

@charlesdong1991

@charlesdong1991 charlesdong1991 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Purpose

Document the implemented connection/writer/scanner client metrics and show how to collect them with a Prometheus exporter.

Linked issue: close #618

@charlesdong1991

Copy link
Copy Markdown
Contributor Author

cc @fresh-borzoni @leekeiabstraction PTAL! 🙏

@leekeiabstraction leekeiabstraction left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for the PR. Left a question

"expected fluss_client_writer_records_send_total >= {rows_per_iter}, got {sent:?}\n{rendered}"
);
println!("self-check OK: records_send_total = {}", sent.unwrap());
verified = true;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to break out of the loop once verification is done? As is, the example would run indefinitely (is that your intention?)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it was the idea actually, because it's for prometheus demo, so how it usually works is the process should stays alive so people can scrape metric endpoint e.g. something like localhost:xxxx/metrics

if we break out of loop, then endpoint will be taken down, so users cannot scrape metrics anymore...

this verify flag was just a check to confirm recorder is wired up, and not supposed to be used for stop condition. if users really want to stop, usually, they should ctrl-C which also reflects how such prometheus setup is working

i can add a bit more explanation in code to make it clearer if you think makes sense @leekeiabstraction 🙏

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, maybe to avoid confusion, examples are not invoked in any CI, so this indefinite loop won't hang CI i believe

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I may have misremembered that some examples are being ran during CI to ensure their correctness. This is fine if it is not ran during CI. If we do that eventually, it might be worth having it run for something like 60s instead.

@leekeiabstraction leekeiabstraction left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY for clarifying. Left a couple more qs.


let scan_records = log_scanner.poll(Duration::from_secs(1)).await?;
let mut count = 0;
for record in scan_records {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are only verifying metrics for sent records, do we need the loop for polling records? If not, it might be worth removing this loop for a more concise example.

The client caches metric handles the first time a writer or scanner is created,
binding them to whichever recorder is installed at that moment. Install your global
recorder **before** calling `FlussConnection::new` (ideally as the very first thing
in `main`). If you install it after creating a writer or scanner, those metrics will

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behaviour might catch user off guard. Is it worth making it so that an Error is returned instead?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[metrics] Add client metrics reference and prometheus example

2 participants